Term-Weighting for Summarization of Multi-party Spoken Dialogues
Abstract
This paper explores the issue of term-weighting in the genre of spontaneous, multi-party spoken dialogues, with the intent of using such term-weights in the creation of extractive meeting summaries. The field of text information retrieval has yielded many term-weighting techniques to import for our purposes; this paper implements and compares several of these, namely tf.idf, Residual IDF and Gain. We propose that term-weighting for multi-party dialogues can exploit patterns in word usage among participant speakers, and introduce the su.idf metric as one attempt to do so. Results for all metrics are reported on both manual and automatic speech recognition (ASR) transcripts, and on both the ICSI and AMI meeting corpora.
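As a rough illustration of two of the imported metrics named above, the following minimal sketch computes tf.idf and Residual IDF over a toy collection in which each "document" is one tokenized meeting transcript. It uses the standard textbook definitions (natural logarithm, length-normalized term frequency, Poisson-based expected IDF), which are assumptions here and may differ from the exact variants used in the paper. The function names and the toy data are hypothetical. Gain and su.idf are omitted because the abstract does not define them.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Weight = (count / doc length) * ln(N / df),
    a common textbook form; the paper's exact variant is not given here.
    """
    n_docs = len(docs)
    df = Counter()  # number of documents each term occurs in
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def residual_idf(docs):
    """Return {term: observed IDF minus the IDF expected under a Poisson model}."""
    n_docs = len(docs)
    df = Counter()  # document frequency
    cf = Counter()  # collection frequency (total occurrences)
    for doc in docs:
        df.update(set(doc))
        cf.update(doc)
    scores = {}
    for t in df:
        observed = -math.log(df[t] / n_docs)
        # Poisson estimate of the probability that a document contains t
        expected = -math.log(1.0 - math.exp(-cf[t] / n_docs))
        scores[t] = observed - expected
    return scores


# Hypothetical toy data: three short "meetings"
meetings = [
    ["budget", "remote", "remote"],
    ["budget", "design"],
    ["budget", "meeting"],
]
weights = tf_idf(meetings)
ridf = residual_idf(meetings)
```

In this toy collection, "budget" appears in every meeting and so receives a tf.idf weight of zero, while "remote", which is concentrated in a single meeting, scores higher under both metrics.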
Related Works
Using Speech-Specific Characteristics for Automatic Speech Summarization
In this thesis we address the challenge of automatically summarizing spontaneous, multi-party spoken dialogues. The experimental hypothesis is that it is advantageous when summarizing such meeting speech to exploit a variety of speech-specific characteristics, rather than simply treating the task as text summarization with a noisy transcript. We begin by investigating which term-weighting metri...
Automatic Summarization of Open-Domain Multiparty Dialogues in Diverse Genres
Automatic summarization of open-domain spoken dialogues is a relatively new research area. This article introduces the task and the challenges involved and motivates and presents an approach for obtaining automatic-extract summaries for human transcripts of multiparty dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to s...
Entrainment in Multi-Party Spoken Dialogues at Multiple Linguistic Levels
Linguistic entrainment, the phenomenon whereby dialogue partners speak more similarly to each other in a variety of dimensions, is key to the success and naturalness of interactions. While there is considerable evidence for both lexical and acoustic-prosodic entrainment, little work has been conducted to investigate the relationship between these two different modalities using the same measures ...
Modelling and Detecting Decisions in Multi-party Dialogue
We describe a process for automatically detecting decision-making sub-dialogues in transcripts of multi-party, human-human meetings. Extending our previous work on action item identification, we propose a structured approach that takes into account the different roles utterances play in the decision-making process. We show that this structured approach outperforms the accuracy achieved by existi...
DIASUMM: Flexible Summarization of Spontaneous Dialogues in Unrestricted Domains
In this paper, we present a summarization system for spontaneous dialogues which consists of a novel multi-stage architecture. It is specifically aimed at addressing issues related to the nature of the texts being spoken vs. written and being dialogical vs. monological. The system is embedded in a graphical user interface and was developed and tested on transcripts of recorded telephone con...